Register stack engine having speculative load/store modes

ABSTRACT

A computer system is provided having a register stack engine to manage data transfers between a backing store and a register stack. The computer system includes a processor and a memory coupled to the processor through a memory channel. The processor includes a register stack to store data from one or more procedures in one or more frames, respectively. The register stack engine monitors activity on the memory channel and transfers data between selected frames of the register stack and a backing store in the memory responsive to the available bandwidth on the memory channel.

BACKGROUND OF THE INVENTION

[0001] 1. Technical Field

[0002] The present invention relates to microprocessors and, in particular, to mechanisms for managing data in a register file.

[0003] 2. Background Art

[0004] Modern processors include extensive execution resources to support concurrent processing of multiple instructions. A processor typically includes one or more integer, floating point, branch, and memory execution units to implement integer, floating point, branch, and load/store instructions, respectively. In addition, integer and floating point units typically include register files to maintain data relatively close to the processor core.

[0005] A register file is a high speed storage structure that is used to temporarily store information close to the execution resources of the processor. The operands on which instructions operate are preferentially stored in the entries (“registers”) of the register file, since they can be accessed more quickly from these locations. Data stored in larger, more remote storage structures such as caches or main memory, may take longer to access. The longer access times can reduce the processor's performance. Register files thus serve as a primary source of data for the processor's execution resources, and high performance processors provide large register files to take advantage of their low access latency.

[0006] Register files take up relatively large areas on the processor's die. While improvements in semiconductor processing have reduced the size of the individual storage elements in a register, the wires that move data in and out of these storage elements have not benefited to the same degree. These wires are responsible for a significant portion of the register file's die area, particularly in the case of multi-ported register files. The die area impact of register files limits the size of the register files (and the number of registers) that can be used effectively on a given processor. Although the number of registers employed on succeeding processor generations has increased, so has the amount of data processors handle. For example, superscalar processors include multiple instruction execution pipelines, each of which must be provided with data. In addition, these instruction execution pipelines operate at ever greater speeds. The net result is that the register files remain a relatively scarce resource, and processors must manage the movement of data in and out of these register files carefully to operate at their peak efficiencies.

[0007] Typical register management techniques empty registers to, and load registers from, higher latency storage devices to optimize register usage. The data transfers are often triggered when control of the processor passes from one software procedure to another. For example, data from the registers used by a first procedure that is currently inactive may be emptied or “spilled” to a backing store if an active procedure requires more registers than are currently available in the register file. When control is returned to the first procedure, registers are reallocated to the procedure and loaded or “filled” with the associated data from the backing store.

[0008] The store and load operations that transfer data between the register file and backing store may have relatively long latencies. This is particularly true if the data sought is only available in one of the large caches or main memory or if significant amounts of data must be transferred from anywhere in the memory hierarchy. In these cases, execution of the newly activated procedure is stalled while the data transfers are implemented. Execution stalls halt the progress of instructions through the processor's execution pipeline, degrading the processor's performance.

[0009] The present invention addresses these and other problems related to register file management.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] The present invention may be understood with reference to the following drawings, in which like elements are indicated by like numbers. These drawings are provided to illustrate selected embodiments of the present invention and are not intended to limit the scope of the invention.

[0011] FIG. 1 is a block diagram of one embodiment of a computer system that implements the present invention.

[0012] FIG. 2 is a block diagram representing one embodiment of a register management system in accordance with the present invention.

[0013] FIG. 3 is a schematic representation of register allocation operations for one embodiment of the register file of FIG. 1.

[0014] FIG. 4 is a schematic representation of the operations implemented by the register stack engine between the backing memory and the register file of FIG. 1.

[0015] FIG. 5 is a flowchart representing one embodiment of the method in accordance with the present invention for speculatively executing register spill and fill operations.

[0016] FIG. 6 is a state machine representing one embodiment of the register stack engine in accordance with the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0017] The following discussion sets forth numerous specific details to provide a thorough understanding of the invention. However, those of ordinary skill in the art, having the benefit of this disclosure, will appreciate that the invention may be practiced without these specific details. In addition, various well-known methods, procedures, components, and circuits have not been described in detail in order to focus attention on the features of the present invention.

[0018] The present invention provides a mechanism for managing the storage of data in a processor's register files. The mechanism identifies available execution cycles in a processor and uses the available execution cycles to speculatively spill data from and fill data into the registers of a register file. Registers associated with currently inactive procedures are targeted by the speculative spill and fill operations.

[0019] For one embodiment of the invention, the speculative spill and fill operations increase the “clean partition” of the register file, using available bandwidth in the processor-memory channel. Here, “clean partition” refers to registers that store valid data which is also backed up in the memory hierarchy, e.g. a backing store. These registers may be allocated to a new procedure without first spilling them because the data they store has already been backed up. If the registers are not needed for a new procedure, they are available for the procedure to which they were previously allocated without first filling them from the backing store. Speculative spill and fill operations reduce the need for mandatory spill and fill operations, which are triggered in response to procedure calls, returns, returns from interrupts, and the like. Mandatory spill and fill operations may cause the processor to stall if the active procedure cannot make forward progress until the mandatory spill/fill operations complete.

[0020] One embodiment of a computer system in accordance with the present invention includes a processor and a memory coupled to the processor through a memory channel. The processor includes a stacked register file and a register stack engine. The stacked register file stores data for one or more procedures in one or more frames, respectively. The register stack engine monitors activity on the processor-memory channel and transfers data between selected frames of the register file and a backing store responsive to the available bandwidth in the memory channel. For example, the register stack engine may monitor a load/store unit of the processor for empty instruction slots and inject speculative load/store operations for the register file when available instruction slots are identified.

[0021] FIG. 1 is a block diagram of one embodiment of a computer system 100 in accordance with the present invention. Computer system 100 includes a processor 110 and a main memory 170. Processor 110 includes an instruction cache 120, an execution core 130, one or more register files 140, a register stack engine (RSE) 150, and one or more data caches 160. A load/store execution unit (LSU) 134 is shown in execution core 130. Other components of processor 110 such as rename logic, retirement logic, instruction decoders, arithmetic/logic unit(s) and the like are not shown. A bus 180 provides a communication channel between main memory 170 and the various components of processor 110.

[0022] For the disclosed embodiment of computer system 100, cache(s) 160 and main memory 170 form a memory hierarchy. Data that is not available in register file 140 may be provided by the first structure in the memory hierarchy in which the data is found. In addition, data that is evicted from register file 140 to accommodate new procedures may be stored in the memory hierarchy until it is needed again. RSE 150 monitors traffic on the memory channel and initiates data transfers between register file(s) 140 and the memory hierarchy when bandwidth is available. For example, RSE 150 may use otherwise idle cycles, i.e. empty instruction slots, on LSU 134 to speculatively execute spill and fill operations. The speculative operations are targeted to increase the portion of data in register file 140 that is backed up in memory 170.

[0023] For one embodiment of the invention, register file 140 is logically partitioned to store data associated with different procedures in different frames. Portions of these frames may overlap to facilitate data transfers between different procedures. To increase the number of registers available for use by the currently executing procedure, RSE 150 speculatively transfers data for inactive procedures between register file 140 and the memory hierarchy. For example, RSE 150 may store data from registers associated with inactive procedures (RSE_Store) to a backing memory. Here, an inactive or parent procedure is a procedure that called the current active procedure either directly or through one or more intervening procedures. Speculative RSE_Stores increase the probability that copies of data stored in registers are already backed up in the memory hierarchy should the registers be needed for use by an active procedure. Similarly, RSE 150 may load data from the memory hierarchy to registers that do not currently store valid data (RSE_Load). Speculative RSE_Loads increase the probability that the data associated with an inactive (parent) procedure will be available in register file 140 when the procedure is re-activated.

[0024] FIG. 2 is a schematic representation of a register management system 200 that is suitable for use with the present invention. Register management system 200 includes register file 140, RSE 150, a memory channel 210 and a backing store 220. Backing store 220 may include, for example, memory locations in one or more of cache(s) 160 and main memory 170. Memory channel 210 may include, for example, bus 180 and/or LSU 134.

[0025] RSE 150 manages data transfers between stacked register file 140 and backing store 220. The disclosed embodiment of RSE 150 includes state registers 280 to track the status of the speculative and mandatory operations it implements. State registers 280 may indicate the next registers targeted by speculative load and store operations (“RSE.LoadReg” and “RSE.StoreReg”, respectively), as well as the location in the backing store associated with the currently active procedure (“RSE.BOF”). Also shown in FIG. 2 is an optional mode status bit (“MSB”) that indicates which, if any, of the speculative operations RSE 150 should implement. These features of RSE 150 are discussed below in greater detail.

[0026] The disclosed embodiment of register file 140 is a stacked register file that is operated as a circular buffer (dashed line) to store data for current and recently active procedures. The embodiment is illustrated for the case in which data for three procedures, ProcA, ProcB and ProcC, is currently being stored. The figure represents the state of register file 140 after ProcA has called ProcB, which has in turn called ProcC. Each procedure has been allocated a set of registers in stacked register file 140.

[0027] In the exemplary state, the instructions of ProcC are currently being executed by processor 110. That is, ProcC is active. The current active frame of stacked register file 140 includes registers 250, which are allocated to ProcC. ProcB, which called ProcC, is inactive, and ProcA, which called ProcB, is inactive. ProcB and ProcA are parent procedures. For the disclosed embodiment of register management system 200, data is transferred between execution core 130 and registers 250 (the active frame) responsive to the instructions of ProcC. RSE 150 implements speculative spill and fill operations on registers 230 and 240, which are allocated to inactive procedures, ProcA and ProcB, respectively. Unallocated registers 260, 270 appear above and below allocated registers 230, 240, 250 in register file 140.

[0028] For the disclosed embodiment of register file 140, the size of the current active frame (registers 250) is indicated by a size of frame parameter for ProcC (SOF_(c)). The active frame includes registers that are available only to ProcC (local registers) as well as registers that may be used to share data with other procedures (output registers). The local registers for ProcC are indicated by a size of locals parameter (SOL_(c)). For inactive procedures, ProcA and ProcB, only local registers are reflected in register file 140 (by SOL_(a) and SOL_(b), respectively). The actual sizes of the corresponding frames, when active, are indicated through frame-tracking registers, which are discussed in greater detail below.

[0029] FIG. 3 represents a series of register allocation/deallocation operations in response to procedure calls and returns for one embodiment of computer system 100. In particular, FIG. 3 illustrates the instructions, register allocation, and frame tracking that occur when ProcB passes control of processor 110 to ProcC and when ProcC returns control of processor 110 to ProcB.

[0030] At time (I), the instructions of ProcB are executing on the processor, i.e. ProcB is active. ProcB has a frame size of 21 registers (SOF_(b)=21), of which 14 are local to ProcB (SOL_(b)=14) and 7 are available for sharing. A current frame marker (CFM) tracks SOF and SOL for the active procedure, and a previous frame marker (PFM) tracks SOF and SOL for the procedure that called the current active procedure.

[0031] ProcB calls ProcC, which is initialized with the output registers of ProcB and no local registers (SOL_(c)=0 and SOF_(c)=7) at time (II). For the disclosed embodiment, initialization is accomplished by renaming output registers of ProcB to output registers of ProcC. The SOF and SOL values for ProcB are stored in PFM and the SOF and SOL values of ProcC are stored in CFM.

[0032] ProcC executes an allocate instruction to acquire additional registers and redistribute the registers of its frame among local and output registers. At time (III), following the allocation, the current active frame for ProcC includes 19 registers, 16 of which are local. CFM is updated from (SOL_(c)=0 and SOF_(c)=7) to (SOL_(c)=16 and SOF_(c)=19). PFM is unchanged by the allocation instruction. When ProcC completes, it executes a return instruction to return control of the processor to ProcB. At time (IV), following execution of the return instruction, ProcB's frame is restored using the values from PFM.
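
By way of illustration only, the frame tracking of FIG. 3 may be sketched in Python as follows. The names FrameMarker and FrameTracker, and the methods call, alloc and ret, are hypothetical and are not part of the disclosed embodiment. The sketch assumes a single PFM, so it models one level of call and return as in the example of times (I) through (IV); nested calls would additionally require the caller's PFM to be saved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameMarker:
    sof: int  # size of frame (local plus output registers)
    sol: int  # size of locals

    @property
    def outputs(self) -> int:
        # Output registers are the part of the frame that is not local.
        return self.sof - self.sol


class FrameTracker:
    """Hypothetical helper that tracks CFM and PFM across call, alloc and return."""

    def __init__(self, sof: int, sol: int) -> None:
        self.cfm = FrameMarker(sof, sol)
        self.pfm: Optional[FrameMarker] = None

    def call(self) -> None:
        # The caller's marker is saved in PFM; the callee starts with the
        # caller's output registers and no local registers.
        self.pfm = self.cfm
        self.cfm = FrameMarker(sof=self.cfm.outputs, sol=0)

    def alloc(self, sof: int, sol: int) -> None:
        # Allocation resizes the current frame only; PFM is left unchanged.
        if sol > sof:
            raise ValueError("locals cannot exceed the frame size")
        self.cfm = FrameMarker(sof, sol)

    def ret(self) -> None:
        # Return restores the caller's frame from PFM.
        if self.pfm is None:
            raise RuntimeError("no previous frame marker to restore")
        self.cfm = self.pfm


tracker = FrameTracker(sof=21, sol=14)   # time (I): ProcB active
tracker.call()                           # time (II): ProcC starts with 7 registers
tracker.alloc(sof=19, sol=16)            # time (III): ProcC grows its frame
tracker.ret()                            # time (IV): ProcB's frame is restored
```

Under these assumptions the tracker passes through the same (SOF, SOL) values recited for times (I) through (IV).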

[0033] The above described procedure-switching may trigger the transfer of data between register file 140 and backing store 220. Load and store operations triggered in response to procedure switching are termed “mandatory”. Mandatory store (“spill”) operations occur, for example, when a new procedure requires the use of a large number of registers, and some of these registers store data for another procedure that has yet to be copied to backing store 220. In this case, RSE 150 issues one or more store operations to save the data to backing store 220 before allocating the registers to the newly activated procedure. This prevents the new procedure from overwriting data in the newly allocated registers.

[0034] Mandatory fill operations may occur when the processor returns to a parent procedure if the data associated with the parent procedure has been evicted from the register file to accommodate data for another procedure. In this case, RSE 150 issues one or more load operations to restore the data to the registers associated with the re-activated parent procedure.

[0035] When forward progress of the newly activated (or re-activated) procedure is blocked by mandatory spill and fill operations, the processor stalls until these operations complete. This reduces the performance of the processor.

[0036] The present invention provides a mechanism that speculatively saves and restores (spills and fills) data from registers in inactive frames to reduce the number of stalls generated by mandatory RSE operations. Speculative operations allow the active procedure to use more of the registers in register file 140 without concern for overwriting data from inactive procedures that has yet to be backed up or evicting data for inactive procedures unnecessarily.

[0037] For one embodiment of the invention, the register file is partitioned according to the state of the data in different registers. These registers are partitioned as follows:

[0038] The Clean Partition includes registers that store data values from parent procedure frames. The registers in this partition have been successfully spilled to the backing store by the RSE and their contents have not been modified since they were written to the backing store. For the disclosed embodiment of the register management system, the clean partition includes the registers between the next register to be stored by the RSE (RSE.StoreReg) and the next register to be loaded by the RSE (RSE.LoadReg).

[0039] The Dirty Partition includes registers that store data values from parent procedure frames. The data in this partition has not yet been spilled to the backing store by the RSE. The number of registers in the dirty partition (“ndirty”) is equal to the distance between a pointer to the register at the bottom of the current active frame (RSE.BOF) and a pointer to the next register to be stored by the RSE (RSE.StoreReg).

[0040] The Current Frame includes stacked registers allocated for use by the procedure that currently controls the processor. The position of the current frame in the physical stacked register file is defined by RSE.BOF, and the number of registers in the current frame is specified by the size of frame parameter in the current frame marker (CFM.sof).

[0041] The Invalid Partition includes registers outside the current frame that do not store values from parent procedures. Registers in this partition are available for immediate allocation into the current frame or for RSE load operations.
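
By way of illustration only, the four partitions described above may be derived from the tracking pointers as sketched below in Python. The function names partition_sizes and classify are hypothetical. The sketch assumes that the register file is a circular buffer of n_phys physical registers, that load_reg designates the lowest register of the clean partition, and that the partitions never wrap onto one another; these are simplifying assumptions rather than features of the disclosed embodiment.

```python
def partition_sizes(n_phys, bof, store_reg, load_reg, sof):
    """Return the number of registers in each partition of a circular register file."""
    # ndirty is the distance from StoreReg up to BOF, taken modulo the file size.
    ndirty = (bof - store_reg) % n_phys
    # The clean partition lies between LoadReg and StoreReg.
    nclean = (store_reg - load_reg) % n_phys
    # Whatever is neither current, dirty nor clean holds no valid parent data.
    ninvalid = n_phys - sof - ndirty - nclean
    if ninvalid < 0:
        raise ValueError("pointers describe more registers than the file contains")
    return {"current": sof, "dirty": ndirty, "clean": nclean, "invalid": ninvalid}


def classify(n_phys, bof, store_reg, load_reg, sof, reg):
    """Name the partition that physical register number reg falls in."""
    sizes = partition_sizes(n_phys, bof, store_reg, load_reg, sof)
    if (reg - bof) % n_phys < sizes["current"]:
        return "current"
    if (reg - store_reg) % n_phys < sizes["dirty"]:
        return "dirty"
    if (reg - load_reg) % n_phys < sizes["clean"]:
        return "clean"
    return "invalid"


# Example: a 96-register file whose current frame wraps past the top of the file.
sizes = partition_sizes(n_phys=96, bof=90, store_reg=80, load_reg=70, sof=10)
```

In this example the current frame occupies physical registers 90 through 95 and 0 through 3, so the modulo arithmetic is what keeps the classification correct across the wrap.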

[0042] For one embodiment of the invention, RSE 150 tracks the register file partitions and initiates speculative load and store operations between the register file and the backing store when the processor has available bandwidth. Table 1 summarizes the parameters used to track the partitions and the internal state of the RSE. The parameters are named and defined in the first two columns, respectively, and the parameters that are architecturally visible, e.g. available to software, are indicated in the third column of the table. Here, AR represents a set of application registers that may be read or modified by software operating on, e.g., computer system 100. The exemplary registers and instructions discussed in conjunction with Tables 1-4 are from the IA64™ Instruction Set Architecture (ISA), which is described in Intel® IA64 Architecture Software Developer's Guide, Volumes 1-4, published by Intel® Corporation of Santa Clara, Calif.

TABLE 1

Name                Description                                          Architectural Location
RSE.N_Stacked_Phys  Number of stacked physical registers in the
                    particular implementation of the register file
RSE.BOF             Number of the physical register at the bottom of     AR[BSP]
                    the current frame. For the disclosed embodiment,
                    this physical register is mapped to logical
                    register 32.
RSE.StoreReg        Physical register number of the next register to     AR[BSPSTORE]
                    be stored by the RSE
RSE.LoadReg         Physical register number that is one greater than    RSE.BspLoad
                    the next register to be loaded (modulo
                    N_Stacked_Phys)
RSE.BspLoad         Points to the 64-bit backing store address that is
                    8 bytes greater than the next address to be loaded
                    by the RSE
RSE.NATBitIndex     6-bit wide RNAT collection bit index; defines        AR[BSPSTORE]{8:3}
                    which RNAT collection bit gets updated
RSE.CFLE            Current frame load enable bit; control bit that
                    permits the RSE to load registers in the current
                    frame after a branch return or return from
                    interrupt (rfi)

[0043] FIG. 4 is a schematic representation of the operations implemented by RSE 150 to transfer data speculatively between register file 140 and backing store 220. Various partitions 410, 420, 430 and 440 of register file 140 are indicated along with the operations of RSE 150 on these partitions. For the disclosed embodiment, partition 410 comprises the registers of the current (active) frame, which stores data for ProcC.

[0044] Dirty partition 420 comprises registers that store data from a parent procedure which has not yet been copied to backing store 220. For the disclosed embodiment of register management system 200, dirty partition 420 is delineated by the registers indicated through RSE.StoreReg and RSE.BOF. For the example of FIG. 2, dirty partition 420 includes some or all local registers allocated to ProcB and, possibly, ProcA, when the contents of these registers have not yet been copied to backing store 220.

[0045] Clean partition 430 includes local registers whose contents have been copied to backing store 220 and have not been modified in the meantime. For the example of FIG. 2, clean partition 430 may include registers allocated to ProcA and, possibly, ProcB. Invalid partition 440 comprises registers that do not currently store valid data for any procedures.

[0046] RSE 150 monitors processor 110 and executes store operations (RSE_Stores) on registers in dirty partition 420 when bandwidth is available in the memory channel. For the disclosed embodiment of the invention, RSE.StoreReg indicates the next register to be spilled to backing store 220. It is incremented as RSE 150 copies data from register file 140 to backing store 220. RSE_Stores are opportunistic store operations that expand the size of clean partition 430 at the expense of dirty partition 420. RSE_Stores increase the fraction of registers in register file 140 that are backed up in backing store 220. These transfers are speculative because the registers may be reaccessed by the procedure to which they were originally allocated before they are allocated to a new procedure.

[0047] RSE 150 also executes load operations (RSE_Loads) to registers in invalid partition 440, when bandwidth is available in the memory channel. These opportunistic load operations increase the size of clean partition 430 at the expense of invalid partition 440. For the disclosed embodiment, RSE.LoadReg indicates the next register in invalid partition 440 to which RSE 150 restores data. By speculatively repopulating registers in invalid partition 440 with data, RSE 150 reduces the probability that mandatory loads will be necessary to transfer data from backing store 220 to register file 140 when a new procedure is (re)activated. The transfer is speculative because another procedure may require allocation of the registers before the procedure associated with the restored data is re-activated.
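
By way of illustration only, the effect of RSE_Stores and RSE_Loads on the dirty, clean and invalid partitions may be sketched in Python as follows. The class RseSketch and its members are hypothetical. The sketch assumes that stack positions are unbounded integers mapped onto physical registers modulo the file size, that the backing store is a dictionary keyed by stack position, and that the current frame is empty; none of these simplifications is taken from the disclosed embodiment.

```python
class RseSketch:
    """Toy model of speculative RSE_Store and RSE_Load on a circular register file."""

    def __init__(self, n_phys):
        self.n_phys = n_phys
        self.regs = [None] * n_phys  # physical stacked registers
        self.backing = {}            # backing store: stack position -> value
        self.load_pos = 0            # lowest position held in a clean register
        self.store_pos = 0           # next position to spill (StoreReg analogue)
        self.bof_pos = 0             # bottom of the current frame (BOF analogue)

    @property
    def ndirty(self):
        return self.bof_pos - self.store_pos

    @property
    def nclean(self):
        return self.store_pos - self.load_pos

    def push_parent(self, values):
        """Model a call: the caller's values become dirty parent registers."""
        for value in values:
            if self.bof_pos - self.load_pos >= self.n_phys:
                raise RuntimeError("no invalid register left; a mandatory operation is needed")
            self.regs[self.bof_pos % self.n_phys] = value
            self.bof_pos += 1

    def rse_store(self):
        """Spill the oldest dirty register; dirty shrinks and clean grows by one."""
        if self.ndirty == 0:
            return False
        self.backing[self.store_pos] = self.regs[self.store_pos % self.n_phys]
        self.store_pos += 1
        return True

    def reclaim_clean(self, count):
        """Give the oldest clean registers to another frame; no spill is required."""
        count = min(count, self.nclean)
        for _ in range(count):
            self.regs[self.load_pos % self.n_phys] = None
            self.load_pos += 1
        return count

    def rse_load(self):
        """Refill one invalid register; invalid shrinks and clean grows by one."""
        if self.load_pos == 0:
            return False  # nothing older remains in the backing store
        if self.bof_pos - self.load_pos >= self.n_phys:
            return False  # no invalid register is available
        self.load_pos -= 1
        self.regs[self.load_pos % self.n_phys] = self.backing[self.load_pos]
        return True
```

In this model a register that has been spilled can be handed to a new frame without any further store, and a register that has been refilled is available to its original procedure without a mandatory load, which is the benefit attributed above to a larger clean partition.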

[0048] For one embodiment of the invention, RSE 150 may operate in different modes, depending on the nature of the application that is being executed. In all modes, mandatory spill and fill operations are supported. However, some modes may selectively enable speculative spill operations and speculative fill operations. A mode may be selected depending on the anticipated register needs of the application that is to be executed. For example, a register stack configuration (RSC) register may be used to indicate the mode in which RSE 150 operates. Table 2 identifies four RSE modes, the types of RSE loads and RSE stores enabled for each mode, and a bit pattern associated with the mode.

TABLE 2

RSE Mode              RSE Loads                RSE Stores               RSC.mode
Enforced Lazy Mode    Mandatory                Mandatory                00
Store Intensive Mode  Mandatory                Mandatory + Speculative  01
Load Intensive Mode   Mandatory + Speculative  Mandatory                10
Eager Mode            Mandatory + Speculative  Mandatory + Speculative  11
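
By way of illustration only, the mode encoding of Table 2 may be decoded as sketched below in Python. The names RseMode, speculative_stores_enabled and speculative_loads_enabled are hypothetical. The sketch relies on an observation drawn from the bit patterns in Table 2, namely that the low-order bit is set in exactly the modes with speculative stores and the high-order bit is set in exactly the modes with speculative loads; the disclosure does not state that the bits are defined this way.

```python
from enum import IntEnum


class RseMode(IntEnum):
    # Bit patterns as listed in the RSC.mode column of Table 2.
    ENFORCED_LAZY = 0b00
    STORE_INTENSIVE = 0b01
    LOAD_INTENSIVE = 0b10
    EAGER = 0b11


def speculative_stores_enabled(mode: RseMode) -> bool:
    # Low-order bit: set for store intensive mode and eager mode.
    return bool(mode & 0b01)


def speculative_loads_enabled(mode: RseMode) -> bool:
    # High-order bit: set for load intensive mode and eager mode.
    return bool(mode & 0b10)


# Mandatory operations are supported in every mode, so only the speculative
# capabilities need to be decoded.
summary = {
    mode.name: (speculative_loads_enabled(mode), speculative_stores_enabled(mode))
    for mode in RseMode
}
```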

[0049] FIG. 5 is a flowchart representing one embodiment of a method for managing data transfers between a backing store and a register file. Method 500 checks 510 for mandatory RSE operations. If a mandatory RSE operation is pending, it is executed. If no mandatory RSE operations are pending, method 500 determines 530 whether there is any available bandwidth in the memory channel. If bandwidth is available 530, one or more speculative RSE operations are executed 540 and the RSE internal state is updated 550. If no bandwidth is available 530, method 500 continues monitoring 510, 530 for mandatory RSE operations and available bandwidth.
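
By way of illustration only, the flow of method 500 may be sketched in Python as follows. The function names method_500_pass and run, the use of queues of operation labels, and the per-cycle bandwidth flags are all hypothetical. The sketch assumes that a pending mandatory operation is executed regardless of available bandwidth, which the description implies but does not state explicitly.

```python
from collections import deque


def method_500_pass(mandatory_ops, speculative_ops, bandwidth_available):
    """Perform one pass of the FIG. 5 flow; return (kind, op), or None when idle."""
    # Check 510: a pending mandatory operation always takes priority.
    if mandatory_ops:
        return ("mandatory", mandatory_ops.popleft())
    # Check 530: speculative work is attempted only when bandwidth is available.
    if bandwidth_available and speculative_ops:
        # Step 540; the caller would then update the RSE internal state (550).
        return ("speculative", speculative_ops.popleft())
    # Otherwise keep monitoring.
    return None


def run(mandatory_ops, speculative_ops, bandwidth_per_cycle):
    """Drive one pass per cycle; bandwidth_per_cycle is an iterable of booleans."""
    return [
        method_500_pass(mandatory_ops, speculative_ops, bw)
        for bw in bandwidth_per_cycle
    ]


trace = run(
    deque(["mandatory spill"]),
    deque(["RSE_Store", "RSE_Load"]),
    [False, True, False, True],
)
```

In the resulting trace the mandatory spill is serviced first, and the two speculative operations are issued only in the cycles flagged as having available bandwidth.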

[0050] FIG. 6 represents one embodiment of a state machine 600 that may be implemented by RSE 150. State machine 600 includes a monitor state 610, an adjust state 620 and a speculative execution state 630. For purposes of illustration, it is assumed that speculative RSE_Loads and RSE_Stores are both enabled for state machine 600, i.e. it is operating in eager mode.

[0051] In monitor state 610, state machine 600 monitors processor 110 for RSE-related instructions (RI) and available bandwidth (BW). RIs are instructions that may alter portions of the architectural state of the processor that are relevant to the RSE (“RSE state”). The RSE may have to stall the processor and implement mandatory spill and fill operations if these adjustments indicate that data/registers are not available in the register stack. The disclosed embodiment of state machine 600 transitions to adjust state 620 when an RI is detected and implements changes to the RSE state indicated by the RI. If the RSE state indicates that mandatory spill or fill operations (MOPs) are necessary, these are implemented and the RSE state is adjusted accordingly. If no MOPs are indicated by the state change (!MOP), state machine 600 returns to monitor state 610.

[0052] For one embodiment of the invention, RIs include load-register-stack instructions (loadrs), flush-register-stack instructions (flushrs), cover instructions, register allocation instructions (alloc), procedure return instructions (ret) and return-from-interrupt instructions (rfi), which may alter the architectural state of processor 110 as well as the internal state of the RSE. The effects of various RIs on the processor state for one embodiment of register management system 200 are summarized below in Tables 3 and 4.

[0053] If no RIs are detected and bandwidth is available for speculative RSE operations (BW && !RIs), state machine 600 transitions from monitor state 610 to speculative execution state 630. In state 630, state machine 600 may execute RSE_Store instructions for inactive register frames and adjust its register tracking parameter (StoreReg) accordingly, or it may execute RSE_Load instructions on inactive register frames and adjust its memory pointer (BspLoad) and register tracking parameter (LoadReg) accordingly.
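
By way of illustration only, the transitions of state machine 600, including the transitions out of speculative execution state 630 that are described in the following paragraph, may be sketched in Python as follows. The names State and next_state are hypothetical. The sketch assumes that the machine remains in adjust state 620 while mandatory operations are outstanding and remains in state 630 while bandwidth persists and no RI is detected; the description implies but does not expressly recite these self-transitions.

```python
from enum import Enum


class State(Enum):
    MONITOR = 610
    ADJUST = 620
    SPECULATIVE = 630


def next_state(state, ri=False, bw=False, mop=False):
    """Return the next state given RI detection, bandwidth, and pending MOPs."""
    if state is State.MONITOR:
        if ri:
            return State.ADJUST          # an RSE-related instruction was detected
        if bw:
            return State.SPECULATIVE     # BW && !RI
        return State.MONITOR
    if state is State.ADJUST:
        # Assumption: stay here until mandatory operations have completed.
        return State.ADJUST if mop else State.MONITOR
    if state is State.SPECULATIVE:
        if ri:
            return State.ADJUST          # an RI interrupts speculative work
        if not bw:
            return State.MONITOR         # !BW
        return State.SPECULATIVE         # assumption: continue while BW persists
    raise ValueError(f"unknown state: {state!r}")


# A short walk: idle, speculative work, an RI with a mandatory fill, then idle.
walk = [State.MONITOR]
for inputs in (
    {"bw": True},
    {"bw": True},
    {"ri": True, "bw": True},
    {"mop": True},
    {"mop": False},
):
    walk.append(next_state(walk[-1], **inputs))
```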

[0054] State machine 600 transitions from speculative execution state 630 back to monitor state 610 if available bandwidth dries up (!BW). Alternatively, detection of an RI may cause a transition from speculative execution state 630 to adjust state 620.

TABLE 3

Affected state: AR[BSP]{63:3}
  Alloc (r_(1) = ar.pfs, i, l, o, r): Unchanged
  Branch-Call: AR[BSP]{63:3} + CFM.sol + (AR[BSP]{8:3} + CFM.sol)/63
  Branch-Return: AR[BSP]{63:3} − AR[PFS].pfm.sol − (62 − AR[BSP]{8:3} + AR[PFS].pfm.sol)/63
  RFI (CR[IFS].v = 1): AR[BSP]{63:3} − CR[IFS].ifm.sof − (62 − AR[BSP]{8:3} + CR[IFS].ifm.sof)/63

Affected state: AR[PFS]
  Alloc: Unchanged
  Branch-Call: AR[PFS].pfm = CFM; AR[PFS].pec = AR[EC]; AR[PFS].ppl = PSR.cpl
  Branch-Return: Unchanged
  RFI: Unchanged

Affected state: GR[r_(1)]
  Alloc: AR[PFS]
  Branch-Call: N/A
  Branch-Return: N/A
  RFI: N/A

Affected state: CFM
  Alloc: CFM.sof = i + l + o; CFM.sol = i + l; CFM.sor = r>>3
  Branch-Call: CFM.sof = CFM.sof − CFM.sol; CFM.sol = 0; CFM.sor = 0; CFM.rrb.gr = 0; CFM.rrb.fr = 0; CFM.rrb.p = 0
  Branch-Return: AR[PFS].pfm OR CFM.sof = 0; CFM.sol = 0; CFM.sor = 0; CFM.rrb.gr = 0; CFM.rrb.fr = 0; CFM.rrb.p = 0
  RFI: CR[IFS].ifm
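
By way of illustration only, the AR[BSP] updates listed in Table 3 for a branch call and a branch return may be evaluated as integer arithmetic, as sketched below in Python. The function names bsp_after_call and bsp_after_return are hypothetical, and the division is taken to be integer division. The reading that the divide-by-63 term reserves one additional backing store slot for every 63 registers, presumably for the RNAT collection mentioned in Table 1, is an assumption and is not stated in the description.

```python
def bsp_after_call(bsp, sol):
    """Apply the Branch-Call update of Table 3 to a byte address bsp."""
    slot = bsp >> 3        # AR[BSP]{63:3}: address in units of 8-byte slots
    index = slot & 0x3F    # AR[BSP]{8:3}: position within a group of 64 slots
    return (slot + sol + (index + sol) // 63) << 3


def bsp_after_return(bsp, pfm_sol):
    """Apply the Branch-Return update of Table 3 to a byte address bsp."""
    slot = bsp >> 3
    index = slot & 0x3F
    return (slot - pfm_sol - (62 - index + pfm_sol) // 63) << 3


# A caller with 14 local registers whose frame starts at slot 60: the callee's
# frame starts 15 slots later, because one extra slot is skipped on the way.
called = bsp_after_call(60 * 8, 14)
restored = bsp_after_return(called, 14)
```

With these definitions the return update undoes the call update for any starting slot whose index is between 0 and 62, which is consistent with a return restoring the caller's frame.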

[0055]

TABLE 4

Affected state: AR[BSP]{63:3}
  Cover: AR[BSP]{63:3} + CFM.sof + (AR[BSP]{8:3} + CFM.sof)/63
  Flushrs: Unchanged
  Loadrs: Unchanged

Affected state: AR[BSPSTORE]{63:3}
  Cover: Unchanged
  Flushrs: AR[BSP]{63:3}
  Loadrs: AR[BSP]{63:3} − AR[RSC].loadrs{14:3}

Affected state: RSE.BspLoad{63:3}
  Cover: Unchanged
  Flushrs: Model specific
  Loadrs: AR[BSP]{63:3} − AR[RSC].loadrs{14:3}

Affected state: AR[RNAT]
  Cover: Unchanged
  Flushrs: Updated
  Loadrs: Undefined

Affected state: RSE.RNATBitIndex
  Cover: Unchanged
  Flushrs: AR[BSPSTORE]{8:3}
  Loadrs: AR[BSPSTORE]{8:3}

Affected state: CR[IFS]
  Cover: If (PSR.ic == 0) {CR[IFS].ifm = CFM; CR[IFS].v = 1}
  Flushrs: Unchanged
  Loadrs: Unchanged

Affected state: CFM
  Cover: CFM.sof = 0; CFM.sol = 0; CFM.sor = 0; CFM.rrb.gr = 0; CFM.rrb.fr = 0; CFM.rrb.p = 0
  Flushrs: Unchanged
  Loadrs: Unchanged

[0056] The present invention thus provides a register management system that supports more efficient use of a processor's registers. A register stack engine employs available bandwidth in the processor-memory channel to speculatively spill and fill registers allocated to inactive procedures. The speculative operations increase the size of the register file's clean partition, reducing the need for mandatory spill and fill operations which may stall processor execution.

[0057] The disclosed embodiments of the present invention are provided solely for purposes of illustration. Persons skilled in the art of computer architecture and having the benefit of this disclosure will recognize variations on the disclosed embodiments that fall within the spirit of the present invention. The scope of the present invention should be limited only by the appended claims.

What is claimed is:
 1. A computer system comprising: a memory; a register file coupled to the memory through a memory channel, the register file to store data for one or more procedures in one or more frames, respectively; and a register stack engine to monitor activity on the memory channel and to transfer data between selected frames of the register file and the memory responsive to available bandwidth on the memory channel.
 2. The computer system of claim 1, wherein the memory includes a backing store and the register stack engine transfers data between the selected frames and the backing store.
 3. The computer system of claim 1, wherein a portion of the register file is organized as a register stack.
 4. The computer system of claim 3, wherein the register stack engine includes a first pointer to indicate a first location in a current frame of the register stack.
 5. The computer system of claim 4, wherein the register stack engine includes a second pointer to indicate an oldest dirty register in the register stack.
 6. The computer system of claim 5, wherein the register stack engine includes a third pointer to indicate an oldest clean register in the register stack.
 7. The computer system of claim 1, wherein registers of the register file are mapped to a current frame and an inactive frame, and the register stack engine transfers data between registers in the inactive frame and the backing store.
 8. The computer system of claim 7, wherein the registers mapped to the inactive frame are designated as clean or dirty, according to whether data stored in the registers has or has not been spilled to the memory.
 9. The computer system of claim 8, wherein the memory includes a backing store.
 10. The computer system of claim 9, wherein the register stack engine transfers data from a dirty register to a corresponding location in the backing store when bandwidth is available on the memory channel.
 11. The computer system of claim 9, wherein the register stack engine transfers data to a clean register from a corresponding location in the backing store when bandwidth is available on the memory channel.
 12. A method for managing data in a register stack comprising: designating registers in the register stack as clean or dirty, according to whether data in the registers has been spilled to a backing store; monitoring operations on a memory channel; and spilling data from a current oldest dirty register to the backing store when capacity is available on the memory channel.
 13. The method of claim 12, further comprising updating a first pointer to indicate a new oldest dirty register when data is spilled from the current oldest dirty register.
 14. The method of claim 12, further comprising filling data from the backing store to a current oldest clean register when capacity is available on the memory channel.
 15. The method of claim 14, further comprising updating a second pointer to indicate a new oldest clean register when data is filled to the current oldest clean register.
 16. A computer system comprising: a memory system; a register file to store data for an active procedure and one or more inactive procedures; and a register stack engine to transfer data between registers associated with the one or more inactive procedures and the memory system, responsive to available bandwidth to the memory system.
 17. The computer system of claim 16, wherein the computer system further comprises a load/store unit and the register stack engine monitors the load/store unit to determine available bandwidth to the memory system.
 18. The computer system of claim 16, wherein the register stack engine includes a first pointer to track a next inactive register to spill to the memory system and a second pointer to track a next inactive register to fill from the memory system responsive to available bandwidth.
 19. The computer system of claim 16, wherein the register stack engine transfers data for inactive procedures responsive to a mode status indicator.
 20. The computer system of claim 19, wherein the register stack engine operates in a lazy mode, a store intensive mode, a load intensive mode, or an eager mode according to the mode status indicator.
 21. The computer system of claim 19, wherein the mode status indicator is set under software control responsive to a type of application to run on the computer system.